Improved Alignment Models for Statistical Machine Translation
نویسندگان
چکیده
In this paper, we describe improved alignment models for statistical machine translation. The statistical translation approach uses two types of information: a translation model and a language model. The language model used is a bigram or general m-gram model. The translation model is decomposed into a lexical and an alignment model. We describe two different approaches for statistical translation and present experimental results. The first approach is based on dependencies between single words, the second approach explicitly takes shallow phrase structures into account, using two different alignment levels: a phrase level alignment between phrases and a word level alignment between single words. We present results using the Verbmobil task (German-English, 6000word vocabulary) which is a limited-domain spoken-language task. The experimental tests were performed on both the text transcription and the speech recognizer output. 1 S t a t i s t i c a l M a c h i n e T r a n s l a t i o n The goal of machine translation is the translation of a text given in some source language into a target language. We are given a source string f / = fl...fj...fJ, which is to be translated into a target string e{ = el...ei...ex. Among all possible target strings, we will choose the string with the highest probability: = argmax {Pr(ezIlflJ)}
منابع مشابه
Improved Discriminative Bilingual Word Alignment
For many years, statistical machine translation relied on generative models to provide bilingual word alignments. In 2005, several independent efforts showed that discriminative models could be used to enhance or replace the standard generative approach. Building on this work, we demonstrate substantial improvement in word-alignment accuracy, partly though improved training methods, but predomi...
متن کاملA Source-side Decoding Sequence Model for Statistical Machine Translation
We propose a source-side decoding sequence language model for phrase-based statistical machine translation. This model is a reordering model in the sense that it helps the decoder find the correct decoding sequence. The model uses word-aligned bilingual training data. We show improved translation quality of up to 1.34% BLEU and 0.54% TER using this model compared to three other widely used reor...
متن کاملExtensions to HMM-based Statistical Word Alignment Models
This paper describes improved HMM-based word level alignment models for statistical machine translation. We present a method for using part of speech tag information to improve alignment accuracy, and an approach to modeling fertility and correspondence to the empty word in an HMM alignment model. We present accuracy results from evaluating Viterbi alignments against human-judged alignments on ...
متن کاملExtentions to HMM-based Statistical Word Alignment Models
This paper describes improved HMM-based word level alignment models for statistical machine translation. We present a method for using part of speech tag information to improve alignment accuracy, and an approach to modeling fertility and correspondence to the empty word in an HMM alignment model. We present accuracy results from evaluating Viterbi alignments against human-judged alignments on ...
متن کاملRefining Kazakh Word Alignment Using Simulation Modeling Methods for Statistical Machine Translation
Word alignment play an important role in the training of statistical machine translation systems. We present a technique to refine word alignments at phrase level after the collection of sentences from the Kazakh-English parallel corpora. The estimation technique extracts the phrase pairs from the word alignment and then incorporates them into the translation system for further steps. Although ...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 1999